forked from torvalds/linux
DEEPIN: scsi: Bypass certain SCSI commands on disks with "use_192_bytes_for_3f" attribute #7
Closed
…es_for_3f" attribute On some external USB hard drives, mounting can fail if "lshw" is executed during the process. This occurs because data sent to the device's output endpoint in certain abnormal scenarios does not receive a response, leading to a mount timeout. [ Description of "use_192_bytes_for_3f" in the kernel code: ] /* * Many disks only accept MODE SENSE transfer lengths of * 192 bytes (that's what Windows uses). */ sdev->use_192_bytes_for_3f = 1; The kernel's SCSI driver, when handling devices with this attribute, sends commands with a length of 192 bytes like this: if (sdp->use_192_bytes_for_3f) res = sd_do_mode_sense(sdp, 0, 0x3F, buffer, 192, &data, NULL); However, "lshw" disregards the "use_192_bytes_for_3f" attribute and transmits data with a length of 0xff bytes via ioctl, which can cause some hard drives to hang and become unusable. To resolve this issue, prevent commands with a length of 0xff bytes from being queued via ioctl when it detects the "use_192_bytes_for_3f" attribute on the device. The hard drive device identified with the issue is Lenovo USB 17ef:4531. Tested on HONOR NBLK-WAX9X (C234) Notebook with AMD Ryzen 7 3700U. 
[ Kernel logs: ]
2024-10-31 13:36:11 localhost kernel: [ 25.770091] usb 2-2: new SuperSpeed Gen 1 USB device number 2 using xhci_hcd
2024-10-31 13:36:11 localhost kernel: [ 25.798558] usb 2-2: New USB device found, idVendor=17ef, idProduct=4531, bcdDevice= 5.12
2024-10-31 13:36:11 localhost kernel: [ 25.798562] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
2024-10-31 13:36:11 localhost kernel: [ 25.798564] usb 2-2: Product: Lenovo Portable HDD
2024-10-31 13:36:11 localhost kernel: [ 25.798566] usb 2-2: Manufacturer: Lenovo
2024-10-31 13:36:11 localhost kernel: [ 25.798567] usb 2-2: SerialNumber: 000000001E4C
2024-10-31 13:36:11 localhost kernel: [ 25.820244] usb-storage 2-2:1.0: USB Mass Storage device detected
2024-10-31 13:36:11 localhost kernel: [ 25.820457] scsi host0: usb-storage 2-2:1.0
2024-10-31 13:36:11 localhost kernel: [ 25.820633] usbcore: registered new interface driver usb-storage
2024-10-31 13:36:11 localhost kernel: [ 25.825598] usbcore: registered new interface driver uas
2024-10-31 13:36:14 localhost kernel: [ 28.852179] scsi 0:0:0:0: Direct-Access Lenovo USB Hard Drive 0006 PQ: 0 ANSI: 2
2024-10-31 13:36:14 localhost kernel: [ 28.852961] sd 0:0:0:0: Attached scsi generic sg0 type 0
2024-10-31 13:36:14 localhost kernel: [ 28.891218] sd 0:0:0:0: [sda] 976773164 512-byte logical blocks: (500 GB/466 GiB)
2024-10-31 13:36:14 localhost kernel: [ 28.906892] sd 0:0:0:0: [sda] Write Protect is off
2024-10-31 13:36:14 localhost kernel: [ 28.906896] sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
2024-10-31 13:36:14 localhost kernel: [ 28.922606] sd 0:0:0:0: [sda] No Caching mode page found
2024-10-31 13:36:14 localhost kernel: [ 28.922612] sd 0:0:0:0: [sda] Assuming drive cache: write through
2024-10-31 13:36:14 localhost kernel: [ 29.007816] sda: sda1
2024-10-31 13:36:15 localhost kernel: [ 30.180380] sd 0:0:0:0: [sda] Attached SCSI disk
2024-10-31 13:36:16 localhost kernel: [ 30.722863] snd_hda_codec_realtek hdaudioC1D0: hda_codec_setup_stream: NID=0x3, stream=0x5, channel=0, format=0x4011
2024-10-31 13:36:16 localhost kernel: [ 30.734139] snd_hda_codec_realtek hdaudioC1D0: hda_codec_setup_stream: NID=0x2, stream=0x5, channel=0, format=0x4011
2024-10-31 13:36:17 localhost kernel: [ 31.396011] start_addr=(0x20000), end_addr=(0x40000), buffer_size=(0x20000), smp_number_max=(16384)
2024-10-31 13:36:18 localhost kernel: [ 32.933537] snd_hda_codec_realtek hdaudioC1D0: hda_codec_cleanup_stream: NID=0x3
2024-10-31 13:36:18 localhost kernel: [ 32.933541] snd_hda_codec_realtek hdaudioC1D0: hda_codec_cleanup_stream: NID=0x2
2024-10-31 13:36:39 localhost kernel: [ 54.242220] usb 2-2: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
2024-10-31 13:36:50 localhost kernel: [ 64.408879] start_addr=(0x20000), end_addr=(0x40000), buffer_size=(0x20000), smp_number_max=(16384)
2024-10-31 13:37:11 localhost kernel: [ 85.466479] usb 2-2: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
2024-10-31 13:37:11 localhost kernel: [ 85.490248] sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
2024-10-31 13:37:11 localhost kernel: [ 85.490255] sd 0:0:0:0: [sda] tag#0 CDB: Read(10) 28 00 00 00 00 20 00 00 08 00
2024-10-31 13:37:11 localhost kernel: [ 85.490258] print_req_error: I/O error, dev sda, sector 32
2024-10-31 13:37:33 localhost kernel: [ 107.432186] start_addr=(0x20000), end_addr=(0x40000), buffer_size=(0x20000), smp_number_max=(16384)
2024-10-31 13:37:41 localhost kernel: [ 116.194201] usb 2-2: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
2024-10-31 13:37:49 localhost kernel: [ 123.555484] dolphin[7271]: segfault at 10 ip 00007fcccc0d7f76 sp 00007ffe8004b860 error 4 in libKF5CoreAddons.so.5.102.0[7fcccc0a5000+83000]
2024-10-31 13:37:49 localhost kernel: [ 123.555502] Code: d6 90 66 90 41 54 41 89 d4 55 48 89 fd 53 48 89 f3 e8 8e 94 01 00 ba 04 00 00 00 48 89 de 48 89 c7 e8 4e 8f 01 00 84 c0 75 2a <48> 8b 7d 10 48 85 ff 74 21 45 89 e1 48 89 da 48 89 ee 5b 41 b8 01
2024-10-31 13:38:11 localhost kernel: [ 146.229510] usb 2-2: USB disconnect, device number 2
2024-10-31 13:38:11 localhost kernel: [ 146.237993] scsi 0:0:0:0: rejecting I/O to dead device
2024-10-31 13:38:11 localhost kernel: [ 146.238003] print_req_error: I/O error, dev sda, sector 32
2024-10-31 13:38:11 localhost kernel: [ 146.238009] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238029] scsi 0:0:0:0: rejecting I/O to dead device
2024-10-31 13:38:11 localhost kernel: [ 146.238030] print_req_error: I/O error, dev sda, sector 36
2024-10-31 13:38:11 localhost kernel: [ 146.238032] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238045] scsi 0:0:0:0: rejecting I/O to dead device
2024-10-31 13:38:11 localhost kernel: [ 146.238047] print_req_error: I/O error, dev sda, sector 6291480
2024-10-31 13:38:11 localhost kernel: [ 146.238062] Buffer I/O error on dev sda1, logical block 786431, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238168] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238170] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238175] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238176] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238184] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238185] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238199] Buffer I/O error on dev sda, logical block 40, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238201] Buffer I/O error on dev sda, logical block 41, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238205] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238206] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238210] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238211] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238215] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238217] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238220] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238221] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238224] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238226] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:12 localhost kernel: [ 146.482007] snd_hda_codec_realtek hdaudioC1D0: hda_codec_setup_stream: NID=0x3, stream=0x5, channel=0, format=0x4011
2024-10-31 13:38:12 localhost kernel: [ 146.494064] snd_hda_codec_realtek hdaudioC1D0: hda_codec_setup_stream: NID=0x2, stream=0x5, channel=0, format=0x4011
2024-10-31 13:38:15 localhost kernel: [ 150.065848] snd_hda_codec_realtek hdaudioC1D0: hda_codec_cleanup_stream: NID=0x3
2024-10-31 13:38:15 localhost kernel: [ 150.065852] snd_hda_codec_realtek hdaudioC1D0: hda_codec_cleanup_stream: NID=0x2
2024-10-31 13:38:26 localhost kernel: [ 160.433037] start_addr=(0x20000), end_addr=(0x40000), buffer_size=(0x20000), smp_number_max=(16384)
2024-10-31 13:39:29 localhost kernel: [ 223.444589] start_addr=(0x20000), end_addr=(0x40000), buffer_size=(0x20000), smp_number_max=(16384)

Link: https://linux-hardware.org/?id=usb:17ef-4531
Link: https://lore.kernel.org/all/80ef917b-3680-4f85-93ba-c92d2b69ebaa@rowland.harvard.edu/
Link: https://lore.kernel.org/all/ad4bd008-8d0d-439b-879c-e9cf4c89ec56@acm.org/
Link: https://lore.kernel.org/all/4EB8ECD64F601331+e2f01a1f-8da5-4e7b-b909-d920a792756a@uniontech.com/
Reported-by: Xinwei Zhou <zhouxinwei@uniontech.com>
Co-developed-by: Xu Rao <raoxu@uniontech.com>
Signed-off-by: Xu Rao <raoxu@uniontech.com>
Tested-by: Yujing Ming <mingyujing@uniontech.com>
Signed-off-by: WangYuli <wangyuli@uniontech.com>
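The filtering idea described in the commit message can be sketched as a small userspace model. This is an illustration under stated assumptions, not the code from the patch: the struct `fake_sdev`, the function `ioctl_cmd_allowed`, and the choice to match MODE SENSE(6) (opcode 0x1a, allocation length in CDB byte 4) are hypothetical. Only the `use_192_bytes_for_3f` flag and the 0xff length come from the description above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODE_SENSE_6 0x1a /* SCSI MODE SENSE(6) opcode */

/* Hypothetical stand-in for the single struct scsi_device field used here. */
struct fake_sdev {
	bool use_192_bytes_for_3f;
};

/*
 * Decide whether a user-supplied CDB may be queued.
 * Returns false only when the device carries the quirk flag and the
 * command is a MODE SENSE(6) asking for 0xff bytes; everything else
 * passes through unchanged.
 */
static bool ioctl_cmd_allowed(const struct fake_sdev *sdev,
			      const uint8_t *cdb, size_t cdb_len)
{
	if (!sdev->use_192_bytes_for_3f)
		return true;	/* no quirk: nothing to filter */
	if (cdb_len < 6 || cdb[0] != MODE_SENSE_6)
		return true;	/* not the command of interest */
	return cdb[4] != 0xff;	/* byte 4 is the allocation length */
}
```

In this model a 192-byte MODE SENSE still reaches a quirky device, while the 0xff-byte variant is refused before it can be queued; devices without the flag are unaffected.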
xry111 pushed a commit that referenced this pull request on Jul 14, 2025
Similar to the preceding patch for GuC (and with the same references), Intel GPUs expect command buffers to align to 4KiB boundaries. Current code uses `PAGE_SIZE' as an assumed alignment reference, but 4KiB kernel page sizes are by no means a guarantee.

On 16KiB-paged kernels, this causes driver failures during boot up:

[ 14.018975] ------------[ cut here ]------------
[ 14.023562] xe 0000:09:00.0: [drm] GT0: Kernel-submitted job timed out
[ 14.030084] WARNING: CPU: 3 PID: 564 at drivers/gpu/drm/xe/xe_guc_submit.c:1181 guc_exec_queue_timedout_job+0x1c0/0xacc [xe]
[ 14.041300] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) rfkill(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) nls_iso8859_1(E) snd_hda_intel(E) snd_intel_dspcfg(E) qrtr(E) nls_cp437(E) snd_hda_codec(E) spi_loongson_pci(E) rtc_efi(E) snd_hda_core(E) loongson3_cpufreq(E) spi_loongson_core(E) snd_hwdep(E) snd_pcm(E) snd_timer(E) snd(E) soundcore(E) gpio_loongson_64bit(E) input_leds(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[ 14.041369] drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) realtek(E) led_class(E) loongson(E) i2c_algo_bit(E) drm_ttm_helper(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[ 14.153910] CPU: 3 UID: 0 PID: 564 Comm: kworker/u32:2 Tainted: G E 6.14.0-rc4-aosc-main-gbad70b1cd8b0-dirty #7
[ 14.165325] Tainted: [E]=UNSIGNED_MODULE
[ 14.169220] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[ 14.182970] Workqueue: gt-ordered-wq drm_sched_job_timedout [gpu_sched]
[ 14.189549] pc ffff8000024f3760 ra ffff8000024f3760 tp 900000012f150000 sp 900000012f153ca0
[ 14.197853] a0 0000000000000000 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[ 14.206156] a4 0000000000000000 a5 0000000000000000 a6 0000000000000000 a7 0000000000000000
[ 14.214458] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 0000000000000000
[ 14.222761] t4 0000000000000000 t5 0000000000000000 t6 0000000000000000 t7 0000000000000000
[ 14.231064] t8 0000000000000000 u0 900000000195c0c8 s9 900000012e4dcf48 s0 90000001285f3640
[ 14.239368] s1 90000001004f8000 s2 ffff8000026ec000 s3 0000000000000000 s4 900000012e4dc028
[ 14.247672] s5 90000001009f5e00 s6 000000000000137e s7 0000000000000001 s8 900000012f153ce8
[ 14.255975] ra: ffff8000024f3760 guc_exec_queue_timedout_job+0x1c0/0xacc [xe]
[ 14.263379] ERA: ffff8000024f3760 guc_exec_queue_timedout_job+0x1c0/0xacc [xe]
[ 14.270777] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[ 14.276927] PRMD: 00000004 (PPLV0 +PIE -PWE)
[ 14.281258] EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
[ 14.286024] ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[ 14.290790] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[ 14.296329] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[ 14.302299] CPU: 3 UID: 0 PID: 564 Comm: kworker/u32:2 Tainted: G E 6.14.0-rc4-aosc-main-gbad70b1cd8b0-dirty #7
[ 14.302302] Tainted: [E]=UNSIGNED_MODULE
[ 14.302302] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[ 14.302304] Workqueue: gt-ordered-wq drm_sched_job_timedout [gpu_sched]
[ 14.302307] Stack : 900000012f153928 d84a6232d48f1ac7 900000000023eb34 900000012f150000
[ 14.302310] 900000012f153900 0000000000000000 900000012f153908 9000000001c31c70
[ 14.302313] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 14.302315] 0000000000000000 d84a6232d48f1ac7 0000000000000000 0000000000000000
[ 14.302318] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 14.302320] 0000000000000000 0000000000000000 00000000072b4000 900000012e4dcf48
[ 14.302323] 9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[ 14.302325] 0000000000000004 0000000000000000 000000000000137e 0000000000000001
[ 14.302328] 900000012f153ce8 9000000001c31c70 9000000000244174 0000555581840b98
[ 14.302331] 00000000000000b0 0000000000000004 0000000000000000 0000000000071c1d
[ 14.302333] ...
[ 14.302335] Call Trace:
[ 14.302336] [<9000000000244174>] show_stack+0x3c/0x16c
[ 14.302341] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[ 14.302346] [<9000000000288208>] __warn+0x8c/0x174
[ 14.302350] [<90000000017c1918>] report_bug+0x1c0/0x22c
[ 14.302354] [<90000000017f66e8>] do_bp+0x280/0x344
[ 14.302359]
[ 14.302360] ---[ end trace 0000000000000000 ]---

Revise calculation of `RING_CTL_SIZE(size)' to use `SZ_4K' to fix the aforementioned issue.

Cc: stable@vger.kernel.org
Fixes: b79e8fd ("drm/xe: Remove dependency on intel_engine_regs.h")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Wenbin Fang <fangwenbin@vip.qq.com>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Jianfeng Liu <liujianfeng1994@gmail.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-0-934f82249f8a@aosc.io/
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
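Why the subtraction unit matters can be illustrated with a small standalone model. The assumptions here are not stated in the commit message and should be checked against the driver: the macro is modelled as `size - unit`, and the hardware is assumed to read the ring length as a count of 4KiB pages minus one in a field starting at bit 12. The helper names `ring_ctl_size` and `decoded_ring_bytes` are hypothetical.

```c
#include <stdint.h>

#define SZ_4K  0x1000u
#define SZ_16K 0x4000u

/*
 * Model of the register value: ring size in bytes minus one "unit".
 * With unit == SZ_4K this encodes (pages - 1) << 12 for 4KiB pages.
 */
static uint32_t ring_ctl_size(uint32_t size_bytes, uint32_t unit)
{
	return size_bytes - unit;
}

/* Assumed hardware view: the field counts 4KiB pages minus one. */
static uint32_t decoded_ring_bytes(uint32_t ctl)
{
	return ((ctl >> 12) + 1) * SZ_4K;
}
```

Under these assumptions a 16KiB ring programmed with `ring_ctl_size(SZ_16K, SZ_4K)` decodes back to 16KiB, whereas `ring_ctl_size(SZ_16K, SZ_16K)`, the value a 16KiB `PAGE_SIZE` would produce, decodes to only 4KiB, so the hardware and the driver would disagree about where the ring ends.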
KexyBiscuit pushed a commit that referenced this pull request on Jul 20, 2025
A crash in conntrack was reported while trying to unlink the conntrack entry from the hash bucket list:

    [exception RIP: __nf_ct_delete_from_lists+172]
    [..]
 #7 [ff539b5a2b043aa0] nf_ct_delete at ffffffffc124d421 [nf_conntrack]
 #8 [ff539b5a2b043ad0] nf_ct_gc_expired at ffffffffc124d999 [nf_conntrack]
 #9 [ff539b5a2b043ae0] __nf_conntrack_find_get at ffffffffc124efbc [nf_conntrack]
    [..]

The nf_conn struct is marked as allocated from slab but appears to be in a partially initialised state:

 ct hlist pointer is garbage; looks like the ct hash value (hence crash).
 ct->status is equal to IPS_CONFIRMED|IPS_DYING, which is expected
 ct->timeout is 30000 (=30s), which is unexpected.

Everything else looks like normal udp conntrack entry. If we ignore ct->status and pretend it's 0, the entry matches those that are newly allocated but not yet inserted into the hash:
  - ct hlist pointers are overloaded and store/cache the raw tuple hash
  - ct->timeout matches the relative time expected for a new udp flow rather than the absolute 'jiffies' value.

If it were not for the presence of IPS_CONFIRMED, __nf_conntrack_find_get() would have skipped the entry.

Theory is that we did hit following race:

cpu x                   cpu y                   cpu z
 found entry E          found entry E
 E is expired           <preemption>
 nf_ct_delete()
 return E to rcu slab
                                                init_conntrack
                                                E is re-inited,
                                                ct->status set to 0
                                                reply tuplehash hnnode.pprev
                                                stores hash value.

cpu y found E right before it was deleted on cpu x. E is now re-inited on cpu z. cpu y was preempted before checking for expiry and/or confirm bit.

                                                ->refcnt set to 1
                                                E now owned by skb
                                                ->timeout set to 30000

If cpu y were to resume now, it would observe E as expired but would skip E due to missing CONFIRMED bit.

                                                nf_conntrack_confirm gets called
                                                sets: ct->status |= CONFIRMED
                                                This is wrong: E is not yet added
                                                to hashtable.

cpu y resumes, it observes E as expired but CONFIRMED:
                        <resumes>
                        nf_ct_expired()
                         -> yes (ct->timeout is 30s)
                        confirmed bit set.

cpu y will try to delete E from the hashtable:
                        nf_ct_delete() -> set DYING bit
                        __nf_ct_delete_from_lists

Even this scenario doesn't guarantee a crash: cpu z still holds the table bucket lock(s) so y blocks:

                        wait for spinlock held by z

                                                CONFIRMED is set but there is no
                                                guarantee ct will be added to hash:
                                                "chaintoolong" or "clash resolution"
                                                logic both skip the insert step.
                                                reply hnnode.pprev still stores the
                                                hash value.

                                                unlocks spinlock
                                                return NF_DROP
                        <unblocks, then
                         crashes on hlist_nulls_del_rcu pprev>

In case CPU z does insert the entry into the hashtable, cpu y will unlink E again right away but no crash occurs.

Without 'cpu y' race, 'garbage' hlist is of no consequence: ct refcnt remains at 1, eventually skb will be free'd and E gets destroyed via: nf_conntrack_put -> nf_conntrack_destroy -> nf_ct_destroy.

To resolve this, move the IPS_CONFIRMED assignment after the table insertion but before the unlock.

Pablo points out that the confirm-bit-store could be reordered to happen before hlist add resp. the timeout fixup, so switch to set_bit and before_atomic memory barrier to prevent this.

It doesn't matter if other CPUs can observe a newly inserted entry right before the CONFIRMED bit was set: Such event cannot be distinguished from above "E is the old incarnation" case: the entry will be skipped.

Also change nf_ct_should_gc() to first check the confirmed bit.

The gc sequence is:
 1. Check if entry has expired, if not skip to next entry
 2. Obtain a reference to the expired entry.
 3. Call nf_ct_should_gc() to double-check step 1.

nf_ct_should_gc() is thus called only for entries that already failed an expiry check. After this patch, once the confirmed bit check passes ct->timeout has been altered to reflect the absolute 'best before' date instead of a relative time. Step 3 will therefore not remove the entry.

Without this change to nf_ct_should_gc() we could still get this sequence:

 1. Check if entry has expired.
 2. Obtain a reference.
 3. Call nf_ct_should_gc() to double-check step 1:
    4 - entry is still observed as expired
    5 - meanwhile, ct->timeout is corrected to absolute value on other CPU and confirm bit gets set
    6 - confirm bit is seen
    7 - valid entry is removed again

First do check 6), then 4) so the gc expiry check always picks up either confirmed bit unset (entry gets skipped) or expiry re-check failure for re-inited conntrack objects.

This change cannot be backported to releases before 5.19. Without commit 8a75a2c ("netfilter: conntrack: remove unconfirmed list") |= IPS_CONFIRMED line cannot be moved without further changes.

Cc: Razvan Cojocaru <rzvncj@gmail.com>
Link: https://lore.kernel.org/netfilter-devel/20250627142758.25664-1-fw@strlen.de/
Link: https://lore.kernel.org/netfilter-devel/4239da15-83ff-4ca4-939d-faef283471bb@gmail.com/
Fixes: 1397af5 ("netfilter: conntrack: remove the percpu dying list")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
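The ordering argument (fix up the timeout, then publish the confirmed bit; on the reader side, check the confirmed bit before the expiry) can be sketched in portable C11. This is a userspace model, not the kernel code: it swaps in C11 release/acquire atomics where the patch uses set_bit with a before_atomic barrier, and the names `struct entry`, `entry_init`, `entry_confirm`, and `entry_should_gc` are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define CONFIRMED_BIT 0x1u

struct entry {
	_Atomic uint32_t status;
	_Atomic uint32_t timeout; /* relative until confirmed, absolute after */
};

/* Fresh entry: status cleared, timeout holds a relative value. */
static void entry_init(struct entry *e, uint32_t relative_timeout)
{
	atomic_store_explicit(&e->status, 0, memory_order_relaxed);
	atomic_store_explicit(&e->timeout, relative_timeout, memory_order_relaxed);
}

/*
 * Confirm path: convert the timeout to an absolute deadline first (the
 * table insertion would also happen here), then set the confirmed bit
 * with release ordering so readers that see the bit see the deadline.
 */
static void entry_confirm(struct entry *e, uint32_t now)
{
	uint32_t rel = atomic_load_explicit(&e->timeout, memory_order_relaxed);

	atomic_store_explicit(&e->timeout, now + rel, memory_order_relaxed);
	atomic_fetch_or_explicit(&e->status, CONFIRMED_BIT, memory_order_release);
}

/* GC re-check: look at the confirmed bit first, only then at the expiry. */
static bool entry_should_gc(struct entry *e, uint32_t now)
{
	uint32_t status = atomic_load_explicit(&e->status, memory_order_acquire);

	if (!(status & CONFIRMED_BIT))
		return false; /* not published yet: skip, never remove */

	uint32_t deadline = atomic_load_explicit(&e->timeout, memory_order_relaxed);
	return (int32_t)(deadline - now) <= 0;
}
```

In this model an unconfirmed entry whose relative timeout merely looks expired is always skipped, and a confirmed entry is judged only against its absolute deadline, which mirrors the "first do check 6), then 4)" ordering from the commit message.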
KexyBiscuit pushed a commit that referenced this pull request on Jul 23, 2025
Similar to the preceding patch for GuC (and with the same references), Intel GPUs expect command buffers to align to 4KiB boundaries. Current code uses `PAGE_SIZE' as an assumed alignment reference, but 4KiB kernel page sizes are by no means a guarantee.

On 16KiB-paged kernels, this causes driver failures during boot up:

[ 14.018975] ------------[ cut here ]------------
[ 14.023562] xe 0000:09:00.0: [drm] GT0: Kernel-submitted job timed out
[ 14.030084] WARNING: CPU: 3 PID: 564 at drivers/gpu/drm/xe/xe_guc_submit.c:1181 guc_exec_queue_timedout_job+0x1c0/0xacc [xe]
[ 14.041300] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) rfkill(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) nls_iso8859_1(E) snd_hda_intel(E) snd_intel_dspcfg(E) qrtr(E) nls_cp437(E) snd_hda_codec(E) spi_loongson_pci(E) rtc_efi(E) snd_hda_core(E) loongson3_cpufreq(E) spi_loongson_core(E) snd_hwdep(E) snd_pcm(E) snd_timer(E) snd(E) soundcore(E) gpio_loongson_64bit(E) input_leds(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[ 14.041369] drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) realtek(E) led_class(E) loongson(E) i2c_algo_bit(E) drm_ttm_helper(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[ 14.153910] CPU: 3 UID: 0 PID: 564 Comm: kworker/u32:2 Tainted: G E 6.14.0-rc4-aosc-main-gbad70b1cd8b0-dirty #7
[ 14.165325] Tainted: [E]=UNSIGNED_MODULE
[ 14.169220] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[ 14.182970] Workqueue: gt-ordered-wq drm_sched_job_timedout [gpu_sched]
[ 14.189549] pc ffff8000024f3760 ra ffff8000024f3760 tp 900000012f150000 sp 900000012f153ca0
[ 14.197853] a0 0000000000000000 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[ 14.206156] a4 0000000000000000 a5 0000000000000000 a6 0000000000000000 a7 0000000000000000
[ 14.214458] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 0000000000000000
[ 14.222761] t4 0000000000000000 t5 0000000000000000 t6 0000000000000000 t7 0000000000000000
[ 14.231064] t8 0000000000000000 u0 900000000195c0c8 s9 900000012e4dcf48 s0 90000001285f3640
[ 14.239368] s1 90000001004f8000 s2 ffff8000026ec000 s3 0000000000000000 s4 900000012e4dc028
[ 14.247672] s5 90000001009f5e00 s6 000000000000137e s7 0000000000000001 s8 900000012f153ce8
[ 14.255975] ra: ffff8000024f3760 guc_exec_queue_timedout_job+0x1c0/0xacc [xe]
[ 14.263379] ERA: ffff8000024f3760 guc_exec_queue_timedout_job+0x1c0/0xacc [xe]
[ 14.270777] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[ 14.276927] PRMD: 00000004 (PPLV0 +PIE -PWE)
[ 14.281258] EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
[ 14.286024] ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[ 14.290790] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[ 14.296329] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[ 14.302299] CPU: 3 UID: 0 PID: 564 Comm: kworker/u32:2 Tainted: G E 6.14.0-rc4-aosc-main-gbad70b1cd8b0-dirty #7
[ 14.302302] Tainted: [E]=UNSIGNED_MODULE
[ 14.302302] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[ 14.302304] Workqueue: gt-ordered-wq drm_sched_job_timedout [gpu_sched]
[ 14.302307] Stack : 900000012f153928 d84a6232d48f1ac7 900000000023eb34 900000012f150000
[ 14.302310] 900000012f153900 0000000000000000 900000012f153908 9000000001c31c70
[ 14.302313] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 14.302315] 0000000000000000 d84a6232d48f1ac7 0000000000000000 0000000000000000
[ 14.302318] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 14.302320] 0000000000000000 0000000000000000 00000000072b4000 900000012e4dcf48
[ 14.302323] 9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[ 14.302325] 0000000000000004 0000000000000000 000000000000137e 0000000000000001
[ 14.302328] 900000012f153ce8 9000000001c31c70 9000000000244174 0000555581840b98
[ 14.302331] 00000000000000b0 0000000000000004 0000000000000000 0000000000071c1d
[ 14.302333] ...
[ 14.302335] Call Trace:
[ 14.302336] [<9000000000244174>] show_stack+0x3c/0x16c
[ 14.302341] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[ 14.302346] [<9000000000288208>] __warn+0x8c/0x174
[ 14.302350] [<90000000017c1918>] report_bug+0x1c0/0x22c
[ 14.302354] [<90000000017f66e8>] do_bp+0x280/0x344
[ 14.302359]
[ 14.302360] ---[ end trace 0000000000000000 ]---

Revise calculation of `RING_CTL_SIZE(size)' to use `SZ_4K' to fix the aforementioned issue.

Cc: stable@vger.kernel.org
Fixes: b79e8fd ("drm/xe: Remove dependency on intel_engine_regs.h")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Wenbin Fang <fangwenbin@vip.qq.com>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Jianfeng Liu <liujianfeng1994@gmail.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-3-934f82249f8a@aosc.io/
Signed-off-by: Kexy Biscuit <kexybiscuit@aosc.io>
KexyBiscuit
pushed a commit
that referenced
this pull request
Jul 30, 2025
Similar to the preceding patch for GuC (and with the same references), Intel GPUs expect command buffers to align to 4KiB boundaries. Current code uses `PAGE_SIZE' as an assumed alignment reference, but a 4KiB kernel page size is by no means guaranteed.

On 16KiB-paged kernels, this causes driver failures during boot up:

[ 14.018975] ------------[ cut here ]------------
[ 14.023562] xe 0000:09:00.0: [drm] GT0: Kernel-submitted job timed out
[ 14.030084] WARNING: CPU: 3 PID: 564 at drivers/gpu/drm/xe/xe_guc_submit.c:1181 guc_exec_queue_timedout_job+0x1c0/0xacc [xe]
[ 14.041300] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) rfkill(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) nls_iso8859_1(E) snd_hda_intel(E) snd_intel_dspcfg(E) qrtr(E) nls_cp437(E) snd_hda_codec(E) spi_loongson_pci(E) rtc_efi(E) snd_hda_core(E) loongson3_cpufreq(E) spi_loongson_core(E) snd_hwdep(E) snd_pcm(E) snd_timer(E) snd(E) soundcore(E) gpio_loongson_64bit(E) input_leds(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[ 14.041369] drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) realtek(E) led_class(E) loongson(E) i2c_algo_bit(E) drm_ttm_helper(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[ 14.153910] CPU: 3 UID: 0 PID: 564 Comm: kworker/u32:2 Tainted: G E 6.14.0-rc4-aosc-main-gbad70b1cd8b0-dirty #7
[ 14.165325] Tainted: [E]=UNSIGNED_MODULE
[ 14.169220] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[ 14.182970] Workqueue: gt-ordered-wq drm_sched_job_timedout [gpu_sched]
[ 14.189549] pc ffff8000024f3760 ra ffff8000024f3760 tp 900000012f150000 sp 900000012f153ca0
[ 14.197853] a0 0000000000000000 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[ 14.206156] a4 0000000000000000 a5 0000000000000000 a6 0000000000000000 a7 0000000000000000
[ 14.214458] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 0000000000000000
[ 14.222761] t4 0000000000000000 t5 0000000000000000 t6 0000000000000000 t7 0000000000000000
[ 14.231064] t8 0000000000000000 u0 900000000195c0c8 s9 900000012e4dcf48 s0 90000001285f3640
[ 14.239368] s1 90000001004f8000 s2 ffff8000026ec000 s3 0000000000000000 s4 900000012e4dc028
[ 14.247672] s5 90000001009f5e00 s6 000000000000137e s7 0000000000000001 s8 900000012f153ce8
[ 14.255975] ra: ffff8000024f3760 guc_exec_queue_timedout_job+0x1c0/0xacc [xe]
[ 14.263379] ERA: ffff8000024f3760 guc_exec_queue_timedout_job+0x1c0/0xacc [xe]
[ 14.270777] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[ 14.276927] PRMD: 00000004 (PPLV0 +PIE -PWE)
[ 14.281258] EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
[ 14.286024] ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[ 14.290790] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[ 14.296329] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[ 14.302299] CPU: 3 UID: 0 PID: 564 Comm: kworker/u32:2 Tainted: G E 6.14.0-rc4-aosc-main-gbad70b1cd8b0-dirty #7
[ 14.302302] Tainted: [E]=UNSIGNED_MODULE
[ 14.302302] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[ 14.302304] Workqueue: gt-ordered-wq drm_sched_job_timedout [gpu_sched]
[ 14.302307] Stack : 900000012f153928 d84a6232d48f1ac7 900000000023eb34 900000012f150000
[ 14.302310] 900000012f153900 0000000000000000 900000012f153908 9000000001c31c70
[ 14.302313] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 14.302315] 0000000000000000 d84a6232d48f1ac7 0000000000000000 0000000000000000
[ 14.302318] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 14.302320] 0000000000000000 0000000000000000 00000000072b4000 900000012e4dcf48
[ 14.302323] 9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[ 14.302325] 0000000000000004 0000000000000000 000000000000137e 0000000000000001
[ 14.302328] 900000012f153ce8 9000000001c31c70 9000000000244174 0000555581840b98
[ 14.302331] 00000000000000b0 0000000000000004 0000000000000000 0000000000071c1d
[ 14.302333] ...
[ 14.302335] Call Trace:
[ 14.302336] [<9000000000244174>] show_stack+0x3c/0x16c
[ 14.302341] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[ 14.302346] [<9000000000288208>] __warn+0x8c/0x174
[ 14.302350] [<90000000017c1918>] report_bug+0x1c0/0x22c
[ 14.302354] [<90000000017f66e8>] do_bp+0x280/0x344
[ 14.302359]
[ 14.302360] ---[ end trace 0000000000000000 ]---

Revise calculation of `RING_CTL_SIZE(size)' to use `SZ_4K' to fix the aforementioned issue.

Cc: stable@vger.kernel.org
Fixes: b79e8fd ("drm/xe: Remove dependency on intel_engine_regs.h")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Wenbin Fang <fangwenbin@vip.qq.com>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Jianfeng Liu <liujianfeng1994@gmail.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-3-934f82249f8a@aosc.io/
Signed-off-by: Kexy Biscuit <kexybiscuit@aosc.io>
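To make the effect of the `RING_CTL_SIZE(size)' change concrete, the following is a minimal userspace sketch, not the driver code itself. It assumes the macro subtracts one page from the ring size in bytes; the helper names and the 16KiB `EXAMPLE_PAGE_SIZE_16K' constant are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4K 0x1000u

/* Hypothetical stand-in for the kernel's PAGE_SIZE on a 16KiB-paged build. */
#define EXAMPLE_PAGE_SIZE_16K 0x4000u

/*
 * Assumed old behaviour: the value written to the ring control register
 * is derived by subtracting the kernel page size from the ring size.
 */
static inline uint32_t example_ring_ctl_size_page(uint32_t size, uint32_t page_size)
{
	return size - page_size;
}

/*
 * Behaviour described by the commit message: always subtract 4KiB, the
 * unit the hardware expects, whatever the kernel page size is.
 */
static inline uint32_t example_ring_ctl_size_4k(uint32_t size)
{
	return size - SZ_4K;
}

/* Compare both calculations for a 16KiB ring. */
static inline void example_ring_ctl_demo(void)
{
	uint32_t ring = 4 * SZ_4K;

	/* With 4KiB pages both calculations agree. */
	assert(example_ring_ctl_size_page(ring, SZ_4K) == example_ring_ctl_size_4k(ring));
	/* With 16KiB pages the page-based result collapses to zero. */
	assert(example_ring_ctl_size_page(ring, EXAMPLE_PAGE_SIZE_16K) == 0);
	assert(example_ring_ctl_size_4k(ring) == 3 * SZ_4K);
}
```

Under these assumptions, a ring sized in 4KiB units is described to the hardware with a value that is too small whenever the kernel page size is larger than 4KiB, which is consistent with the job timeout shown in the log above.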
KexyBiscuit
pushed a commit
that referenced
this pull request
Jul 31, 2025
KexyBiscuit
pushed a commit
that referenced
this pull request
Jul 31, 2025
KexyBiscuit
pushed a commit
that referenced
this pull request
Jul 31, 2025
Applied from the list instead, thanks! Link: https://lore.kernel.org/all/798FB027101C5650+20250318061125.477498-1-wangyuli@uniontech.com/
KexyBiscuit
pushed a commit
that referenced
this pull request
Aug 1, 2025
…ytes_for_3f" attribute

On some external USB hard drives, mounting can fail if "lshw" is executed during the process. This occurs because data sent to the device's output endpoint in certain abnormal scenarios does not receive a response, leading to a mount timeout.

[ Description of "use_192_bytes_for_3f" in the kernel code: ]

    /*
     * Many disks only accept MODE SENSE transfer lengths of
     * 192 bytes (that's what Windows uses).
     */
    sdev->use_192_bytes_for_3f = 1;

The kernel's SCSI driver, when handling devices with this attribute, sends commands with a length of 192 bytes like this:

    if (sdp->use_192_bytes_for_3f)
        res = sd_do_mode_sense(sdp, 0, 0x3F, buffer, 192, &data, NULL);

However, "lshw" disregards the "use_192_bytes_for_3f" attribute and transmits data with a length of 0xff bytes via ioctl, which can cause some hard drives to hang and become unusable.

To resolve this issue, prevent commands with a length of 0xff bytes from being queued via ioctl when the "use_192_bytes_for_3f" attribute is detected on the device.

The hard drive device identified with the issue is Lenovo USB 17ef:4531.

Tested on HONOR NBLK-WAX9X (C234) Notebook with AMD Ryzen 7 3700U.

[ Kernel logs: ] 2024-10-31 13:36:11 localhost kernel: [ 25.770091] usb 2-2: new SuperSpeed Gen 1 USB device number 2 using xhci_hcd 2024-10-31 13:36:11 localhost kernel: [ 25.798558] usb 2-2: New USB device found, idVendor=17ef, idProduct=4531, bcdDevice= 5.12 2024-10-31 13:36:11 localhost kernel: [ 25.798562] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3 2024-10-31 13:36:11 localhost kernel: [ 25.798564] usb 2-2: Product: Lenovo Portable HDD 2024-10-31 13:36:11 localhost kernel: [ 25.798566] usb 2-2: Manufacturer: Lenovo 2024-10-31 13:36:11 localhost kernel: [ 25.798567] usb 2-2: SerialNumber: 000000001E4C 2024-10-31 13:36:11 localhost kernel: [ 25.820244] usb-storage 2-2:1.0: USB Mass Storage device detected 2024-10-31 13:36:11 localhost kernel: [ 25.820457] scsi host0: usb-storage 2-2:1.0 2024-10-31 13:36:11 localhost kernel: [ 25.820633] usbcore: registered new interface driver usb-storage 2024-10-31 13:36:11 localhost kernel: [ 25.825598] usbcore: registered new interface driver uas 2024-10-31 13:36:14 localhost kernel: [ 28.852179] scsi 0:0:0:0: Direct-Access Lenovo USB Hard Drive 0006 PQ: 0 ANSI: 2 2024-10-31 13:36:14 localhost kernel: [ 28.852961] sd 0:0:0:0: Attached scsi generic sg0 type 0 2024-10-31 13:36:14 localhost kernel: [ 28.891218] sd 0:0:0:0: [sda] 976773164 512-byte logical blocks: (500 GB/466 GiB) 2024-10-31 13:36:14 localhost kernel: [ 28.906892] sd 0:0:0:0: [sda] Write Protect is off 2024-10-31 13:36:14 localhost kernel: [ 28.906896] sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00 2024-10-31 13:36:14 localhost kernel: [ 28.922606] sd 0:0:0:0: [sda] No Caching mode page found 2024-10-31 13:36:14 localhost kernel: [ 28.922612] sd 0:0:0:0: [sda] Assuming drive cache: write through 2024-10-31 13:36:14 localhost kernel: [ 29.007816] sda: sda1 2024-10-31 13:36:15 localhost kernel: [ 30.180380] sd 0:0:0:0: [sda] Attached SCSI disk 2024-10-31 13:36:16 localhost kernel: [ 30.722863] snd_hda_codec_realtek hdaudioC1D0: 
hda_codec_setup_stream: NID=0x3, stream=0x5, channel=0, format=0x4011 2024-10-31 13:36:16 localhost kernel: [ 30.734139] snd_hda_codec_realtek hdaudioC1D0: hda_codec_setup_stream: NID=0x2, stream=0x5, channel=0, format=0x4011 2024-10-31 13:36:17 localhost kernel: [ 31.396011] start_addr=(0x20000), end_addr=(0x40000), buffer_size=(0x20000), smp_number_max=(16384) 2024-10-31 13:36:18 localhost kernel: [ 32.933537] snd_hda_codec_realtek hdaudioC1D0: hda_codec_cleanup_stream: NID=0x3 2024-10-31 13:36:18 localhost kernel: [ 32.933541] snd_hda_codec_realtek hdaudioC1D0: hda_codec_cleanup_stream: NID=0x2 2024-10-31 13:36:39 localhost kernel: [ 54.242220] usb 2-2: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd 2024-10-31 13:36:50 localhost kernel: [ 64.408879] start_addr=(0x20000), end_addr=(0x40000), buffer_size=(0x20000), smp_number_max=(16384) 2024-10-31 13:37:11 localhost kernel: [ 85.466479] usb 2-2: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd 2024-10-31 13:37:11 localhost kernel: [ 85.490248] sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK 2024-10-31 13:37:11 localhost kernel: [ 85.490255] sd 0:0:0:0: [sda] tag#0 CDB: Read(10) 28 00 00 00 00 20 00 00 08 00 2024-10-31 13:37:11 localhost kernel: [ 85.490258] print_req_error: I/O error, dev sda, sector 32 2024-10-31 13:37:33 localhost kernel: [ 107.432186] start_addr=(0x20000), end_addr=(0x40000), buffer_size=(0x20000), smp_number_max=(16384) 2024-10-31 13:37:41 localhost kernel: [ 116.194201] usb 2-2: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd 2024-10-31 13:37:49 localhost kernel: [ 123.555484] dolphin[7271]: segfault at 10 ip 00007fcccc0d7f76 sp 00007ffe8004b860 error 4 in libKF5CoreAddons.so.5.102.0[7fcccc0a5000+83000] 2024-10-31 13:37:49 localhost kernel: [ 123.555502] Code: d6 90 66 90 41 54 41 89 d4 55 48 89 fd 53 48 89 f3 e8 8e 94 01 00 ba 04 00 00 00 48 89 de 48 89 c7 e8 4e 8f 01 00 84 c0 75 2a <48> 8b 7d 10 48 85 ff 74 21 45 89 e1 48 89 
da 48 89 ee 5b 41 b8 01
2024-10-31 13:38:11 localhost kernel: [ 146.229510] usb 2-2: USB disconnect, device number 2
2024-10-31 13:38:11 localhost kernel: [ 146.237993] scsi 0:0:0:0: rejecting I/O to dead device
2024-10-31 13:38:11 localhost kernel: [ 146.238003] print_req_error: I/O error, dev sda, sector 32
2024-10-31 13:38:11 localhost kernel: [ 146.238009] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238029] scsi 0:0:0:0: rejecting I/O to dead device
2024-10-31 13:38:11 localhost kernel: [ 146.238030] print_req_error: I/O error, dev sda, sector 36
2024-10-31 13:38:11 localhost kernel: [ 146.238032] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238045] scsi 0:0:0:0: rejecting I/O to dead device
2024-10-31 13:38:11 localhost kernel: [ 146.238047] print_req_error: I/O error, dev sda, sector 6291480
2024-10-31 13:38:11 localhost kernel: [ 146.238062] Buffer I/O error on dev sda1, logical block 786431, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238168] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238170] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238175] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238176] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238184] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238185] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238199] Buffer I/O error on dev sda, logical block 40, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238201] Buffer I/O error on dev sda, logical block 41, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238205] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238206] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238210] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238211] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238215] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238217] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238220] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238221] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238224] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238226] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:12 localhost kernel: [ 146.482007] snd_hda_codec_realtek hdaudioC1D0: hda_codec_setup_stream: NID=0x3, stream=0x5, channel=0, format=0x4011
2024-10-31 13:38:12 localhost kernel: [ 146.494064] snd_hda_codec_realtek hdaudioC1D0: hda_codec_setup_stream: NID=0x2, stream=0x5, channel=0, format=0x4011
2024-10-31 13:38:15 localhost kernel: [ 150.065848] snd_hda_codec_realtek hdaudioC1D0: hda_codec_cleanup_stream: NID=0x3
2024-10-31 13:38:15 localhost kernel: [ 150.065852] snd_hda_codec_realtek hdaudioC1D0: hda_codec_cleanup_stream: NID=0x2
2024-10-31 13:38:26 localhost kernel: [ 160.433037] start_addr=(0x20000), end_addr=(0x40000), buffer_size=(0x20000), smp_number_max=(16384)
2024-10-31 13:39:29 localhost kernel: [ 223.444589] start_addr=(0x20000), end_addr=(0x40000), buffer_size=(0x20000), smp_number_max=(16384)

Link: https://linux-hardware.org/?id=usb:17ef-4531
Link: https://lore.kernel.org/all/80ef917b-3680-4f85-93ba-c92d2b69ebaa@rowland.harvard.edu/
Link: https://lore.kernel.org/all/ad4bd008-8d0d-439b-879c-e9cf4c89ec56@acm.org/
Link: https://lore.kernel.org/all/4EB8ECD64F601331+e2f01a1f-8da5-4e7b-b909-d920a792756a@uniontech.com/
Reported-by: Xinwei Zhou <zhouxinwei@uniontech.com>
Co-developed-by: Xu Rao <raoxu@uniontech.com>
Signed-off-by: Xu Rao <raoxu@uniontech.com>
Tested-by: Yujing Ming <mingyujing@uniontech.com>
Signed-off-by: WangYuli <wangyuli@uniontech.com>
Link: https://lore.kernel.org/all/798FB027101C5650+20250318061125.477498-1-wangyuli@uniontech.com/
Link: #7
Signed-off-by: Kexy Biscuit <kexybiscuit@aosc.io>
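The check described in the commit message can be pictured as a small filter on the command descriptor block (CDB) that an ioctl caller submits. What follows is a minimal user-space sketch under stated assumptions, not the code of this patch: `struct fake_sdev` and `should_block_ioctl_cdb()` are hypothetical names, and limiting the filter to MODE SENSE(6) is an assumption. The only details taken from the SCSI command layout are that MODE SENSE(6) uses opcode 0x1a and carries its allocation length in CDB byte 4.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODE_SENSE_6 0x1a /* SCSI opcode for MODE SENSE(6) */

/* Hypothetical stand-in for the single scsi_device bit this sketch needs. */
struct fake_sdev {
	bool use_192_bytes_for_3f;
};

/*
 * Return true when a user-supplied CDB should be refused.
 * Assumption: only MODE SENSE(6) requests whose allocation length
 * (CDB byte 4) is 0xff are filtered, and only on flagged devices.
 */
bool should_block_ioctl_cdb(const struct fake_sdev *sdev,
			    const uint8_t *cdb, size_t cdb_len)
{
	if (!sdev->use_192_bytes_for_3f)
		return false;            /* device has no quirk: pass through */
	if (cdb_len < 6 || cdb[0] != MODE_SENSE_6)
		return false;            /* not a MODE SENSE(6) request */
	return cdb[4] == 0xff;           /* the length reported to hang the drive */
}
```

With this shape, a 192-byte MODE SENSE still passes on a flagged device, and devices without the attribute are not affected at all.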
KexyBiscuit
pushed a commit
that referenced
this pull request
Aug 1, 2025
Similar to the preceding patch for GuC (and with the same references), Intel GPUs expect command buffers to align to 4KiB boundaries. Current code uses `PAGE_SIZE' as an assumed alignment reference, but a 4KiB kernel page size is by no means guaranteed.

On 16KiB-paged kernels, this causes driver failures during boot up:

[ 14.018975] ------------[ cut here ]------------
[ 14.023562] xe 0000:09:00.0: [drm] GT0: Kernel-submitted job timed out
[ 14.030084] WARNING: CPU: 3 PID: 564 at drivers/gpu/drm/xe/xe_guc_submit.c:1181 guc_exec_queue_timedout_job+0x1c0/0xacc [xe]
[ 14.041300] Modules linked in: nf_conntrack_netbios_ns(E) nf_conntrack_broadcast(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) rfkill(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nf_tables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) snd_hda_codec_conexant(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) nls_iso8859_1(E) snd_hda_intel(E) snd_intel_dspcfg(E) qrtr(E) nls_cp437(E) snd_hda_codec(E) spi_loongson_pci(E) rtc_efi(E) snd_hda_core(E) loongson3_cpufreq(E) spi_loongson_core(E) snd_hwdep(E) snd_pcm(E) snd_timer(E) snd(E) soundcore(E) gpio_loongson_64bit(E) input_leds(E) rtc_loongson(E) i2c_ls2x(E) mousedev(E) sch_fq_codel(E) fuse(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) xe(E) drm_gpuvm(E) drm_buddy(E) gpu_sched(E)
[ 14.041369] drm_exec(E) drm_suballoc_helper(E) drm_display_helper(E) cec(E) rc_core(E) hid_generic(E) tpm_tis_spi(E) r8169(E) realtek(E) led_class(E) loongson(E) i2c_algo_bit(E) drm_ttm_helper(E) ttm(E) drm_client_lib(E) drm_kms_helper(E) sunrpc(E) i2c_dev(E)
[ 14.153910] CPU: 3 UID: 0 PID: 564 Comm: kworker/u32:2 Tainted: G E 6.14.0-rc4-aosc-main-gbad70b1cd8b0-dirty #7
[ 14.165325] Tainted: [E]=UNSIGNED_MODULE
[ 14.169220] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[ 14.182970] Workqueue: gt-ordered-wq drm_sched_job_timedout [gpu_sched]
[ 14.189549] pc ffff8000024f3760 ra ffff8000024f3760 tp 900000012f150000 sp 900000012f153ca0
[ 14.197853] a0 0000000000000000 a1 0000000000000000 a2 0000000000000000 a3 0000000000000000
[ 14.206156] a4 0000000000000000 a5 0000000000000000 a6 0000000000000000 a7 0000000000000000
[ 14.214458] t0 0000000000000000 t1 0000000000000000 t2 0000000000000000 t3 0000000000000000
[ 14.222761] t4 0000000000000000 t5 0000000000000000 t6 0000000000000000 t7 0000000000000000
[ 14.231064] t8 0000000000000000 u0 900000000195c0c8 s9 900000012e4dcf48 s0 90000001285f3640
[ 14.239368] s1 90000001004f8000 s2 ffff8000026ec000 s3 0000000000000000 s4 900000012e4dc028
[ 14.247672] s5 90000001009f5e00 s6 000000000000137e s7 0000000000000001 s8 900000012f153ce8
[ 14.255975] ra: ffff8000024f3760 guc_exec_queue_timedout_job+0x1c0/0xacc [xe]
[ 14.263379] ERA: ffff8000024f3760 guc_exec_queue_timedout_job+0x1c0/0xacc [xe]
[ 14.270777] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[ 14.276927] PRMD: 00000004 (PPLV0 +PIE -PWE)
[ 14.281258] EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
[ 14.286024] ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[ 14.290790] ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
[ 14.296329] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000-HV)
[ 14.302299] CPU: 3 UID: 0 PID: 564 Comm: kworker/u32:2 Tainted: G E 6.14.0-rc4-aosc-main-gbad70b1cd8b0-dirty #7
[ 14.302302] Tainted: [E]=UNSIGNED_MODULE
[ 14.302302] Hardware name: Loongson Loongson-3A6000-HV-7A2000-1w-V0.1-EVB/Loongson-3A6000-HV-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05756-prestab
[ 14.302304] Workqueue: gt-ordered-wq drm_sched_job_timedout [gpu_sched]
[ 14.302307] Stack : 900000012f153928 d84a6232d48f1ac7 900000000023eb34 900000012f150000
[ 14.302310] 900000012f153900 0000000000000000 900000012f153908 9000000001c31c70
[ 14.302313] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 14.302315] 0000000000000000 d84a6232d48f1ac7 0000000000000000 0000000000000000
[ 14.302318] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 14.302320] 0000000000000000 0000000000000000 00000000072b4000 900000012e4dcf48
[ 14.302323] 9000000001eb8000 0000000000000000 9000000001c31c70 0000000000000004
[ 14.302325] 0000000000000004 0000000000000000 000000000000137e 0000000000000001
[ 14.302328] 900000012f153ce8 9000000001c31c70 9000000000244174 0000555581840b98
[ 14.302331] 00000000000000b0 0000000000000004 0000000000000000 0000000000071c1d
[ 14.302333] ...
[ 14.302335] Call Trace:
[ 14.302336] [<9000000000244174>] show_stack+0x3c/0x16c
[ 14.302341] [<900000000023eb30>] dump_stack_lvl+0x84/0xe0
[ 14.302346] [<9000000000288208>] __warn+0x8c/0x174
[ 14.302350] [<90000000017c1918>] report_bug+0x1c0/0x22c
[ 14.302354] [<90000000017f66e8>] do_bp+0x280/0x344
[ 14.302359]
[ 14.302360] ---[ end trace 0000000000000000 ]---

Revise the calculation of `RING_CTL_SIZE(size)' to use `SZ_4K' to fix the aforementioned issue.

Cc: stable@vger.kernel.org
Fixes: b79e8fd ("drm/xe: Remove dependency on intel_engine_regs.h")
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Tested-by: Wenbin Fang <fangwenbin@vip.qq.com>
Tested-by: Haien Liang <27873200@qq.com>
Tested-by: Jianfeng Liu <liujianfeng1994@gmail.com>
Tested-by: Shirong Liu <lsr1024@qq.com>
Tested-by: Haofeng Wu <s2600cw2@126.com>
Link: FanFansfan@22c55ab
Link: https://t.me/c/1109254909/768552
Co-developed-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Shang Yatsen <429839446@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
Link: https://lore.kernel.org/all/20250613-upstream-xe-non-4k-v2-v2-3-934f82249f8a@aosc.io/
Signed-off-by: Kexy Biscuit <kexybiscuit@aosc.io>
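To make the arithmetic behind the `RING_CTL_SIZE(size)' change concrete, here is a small user-space sketch. It assumes the macro has the shape "ring size in bytes minus one unit" and that the hardware reads the result as a count of 4KiB units minus one, starting at bit 12. Both points are assumptions for illustration and are not quoted from the driver. The helper names `ring_ctl_size()` and `decoded_4k_units()` and the 16KiB example ring are hypothetical.

```c
#include <stdint.h>

#define SZ_4K  0x1000u
#define SZ_16K 0x4000u

/*
 * Assumed shape of the macro: ring size in bytes minus one unit.
 * `unit` stands in for PAGE_SIZE (old behaviour) or SZ_4K (new behaviour).
 */
uint32_t ring_ctl_size(uint32_t size_bytes, uint32_t unit)
{
	return size_bytes - unit;
}

/*
 * Assumed hardware view: the field holds "number of 4KiB units - 1",
 * stored from bit 12 upwards.
 */
uint32_t decoded_4k_units(uint32_t ctl)
{
	return (ctl >> 12) + 1;
}
```

Under these assumptions, a 16KiB ring programmed with a 4KiB unit decodes to four 4KiB units, which matches the buffer. Programmed with a 16KiB `PAGE_SIZE', the same ring decodes to a single 4KiB unit, so the hardware would see a ring four times smaller than the one the driver allocated.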
KexyBiscuit
pushed a commit
that referenced
this pull request
Aug 3, 2025
KexyBiscuit
pushed a commit
that referenced
this pull request
Aug 3, 2025
KexyBiscuit
pushed a commit
that referenced
this pull request
Aug 6, 2025
As syzbot [1] reported as below: R10: 0000000000000100 R11: 0000000000000206 R12: 00007ffe17473450 R13: 00007f28b1c10854 R14: 000000000000dae5 R15: 00007ffe17474520 </TASK> ---[ end trace 0000000000000000 ]--- ================================================================== BUG: KASAN: use-after-free in __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62 Read of size 8 at addr ffff88812d962278 by task syz-executor/564 CPU: 1 PID: 564 Comm: syz-executor Tainted: G W 6.1.129-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 Call Trace: <TASK> __dump_stack+0x21/0x24 lib/dump_stack.c:88 dump_stack_lvl+0xee/0x158 lib/dump_stack.c:106 print_address_description+0x71/0x210 mm/kasan/report.c:316 print_report+0x4a/0x60 mm/kasan/report.c:427 kasan_report+0x122/0x150 mm/kasan/report.c:531 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report_generic.c:351 __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62 __list_del_entry include/linux/list.h:134 [inline] list_del_init include/linux/list.h:206 [inline] f2fs_inode_synced+0xf7/0x2e0 fs/f2fs/super.c:1531 f2fs_update_inode+0x74/0x1c40 fs/f2fs/inode.c:585 f2fs_update_inode_page+0x137/0x170 fs/f2fs/inode.c:703 f2fs_write_inode+0x4ec/0x770 fs/f2fs/inode.c:731 write_inode fs/fs-writeback.c:1460 [inline] __writeback_single_inode+0x4a0/0xab0 fs/fs-writeback.c:1677 writeback_single_inode+0x221/0x8b0 fs/fs-writeback.c:1733 sync_inode_metadata+0xb6/0x110 fs/fs-writeback.c:2789 f2fs_sync_inode_meta+0x16d/0x2a0 fs/f2fs/checkpoint.c:1159 block_operations fs/f2fs/checkpoint.c:1269 [inline] f2fs_write_checkpoint+0xca3/0x2100 fs/f2fs/checkpoint.c:1658 kill_f2fs_super+0x231/0x390 fs/f2fs/super.c:4668 deactivate_locked_super+0x98/0x100 fs/super.c:332 deactivate_super+0xaf/0xe0 fs/super.c:363 cleanup_mnt+0x45f/0x4e0 fs/namespace.c:1186 __cleanup_mnt+0x19/0x20 fs/namespace.c:1193 task_work_run+0x1c6/0x230 kernel/task_work.c:203 exit_task_work include/linux/task_work.h:39 [inline] 
do_exit+0x9fb/0x2410 kernel/exit.c:871 do_group_exit+0x210/0x2d0 kernel/exit.c:1021 __do_sys_exit_group kernel/exit.c:1032 [inline] __se_sys_exit_group kernel/exit.c:1030 [inline] __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1030 x64_sys_call+0x7b4/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:232 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x68/0xd2 RIP: 0033:0x7f28b1b8e169 Code: Unable to access opcode bytes at 0x7f28b1b8e13f. RSP: 002b:00007ffe174710a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 RAX: ffffffffffffffda RBX: 00007f28b1c10879 RCX: 00007f28b1b8e169 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001 RBP: 0000000000000002 R08: 00007ffe1746ee47 R09: 00007ffe17472360 R10: 0000000000000009 R11: 0000000000000246 R12: 00007ffe17472360 R13: 00007f28b1c10854 R14: 000000000000dae5 R15: 00007ffe17474520 </TASK> Allocated by task 569: kasan_save_stack mm/kasan/common.c:45 [inline] kasan_set_track+0x4b/0x70 mm/kasan/common.c:52 kasan_save_alloc_info+0x25/0x30 mm/kasan/generic.c:505 __kasan_slab_alloc+0x72/0x80 mm/kasan/common.c:328 kasan_slab_alloc include/linux/kasan.h:201 [inline] slab_post_alloc_hook+0x4f/0x2c0 mm/slab.h:737 slab_alloc_node mm/slub.c:3398 [inline] slab_alloc mm/slub.c:3406 [inline] __kmem_cache_alloc_lru mm/slub.c:3413 [inline] kmem_cache_alloc_lru+0x104/0x220 mm/slub.c:3429 alloc_inode_sb include/linux/fs.h:3245 [inline] f2fs_alloc_inode+0x2d/0x340 fs/f2fs/super.c:1419 alloc_inode fs/inode.c:261 [inline] iget_locked+0x186/0x880 fs/inode.c:1373 f2fs_iget+0x55/0x4c60 fs/f2fs/inode.c:483 f2fs_lookup+0x366/0xab0 fs/f2fs/namei.c:487 __lookup_slow+0x2a3/0x3d0 fs/namei.c:1690 lookup_slow+0x57/0x70 fs/namei.c:1707 walk_component+0x2e6/0x410 fs/namei.c:1998 lookup_last fs/namei.c:2455 [inline] path_lookupat+0x180/0x490 fs/namei.c:2479 filename_lookup+0x1f0/0x500 fs/namei.c:2508 vfs_statx+0x10b/0x660 fs/stat.c:229 vfs_fstatat fs/stat.c:267 
[inline] vfs_lstat include/linux/fs.h:3424 [inline] __do_sys_newlstat fs/stat.c:423 [inline] __se_sys_newlstat+0xd5/0x350 fs/stat.c:417 __x64_sys_newlstat+0x5b/0x70 fs/stat.c:417 x64_sys_call+0x393/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:7 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x68/0xd2 Freed by task 13: kasan_save_stack mm/kasan/common.c:45 [inline] kasan_set_track+0x4b/0x70 mm/kasan/common.c:52 kasan_save_free_info+0x31/0x50 mm/kasan/generic.c:516 ____kasan_slab_free+0x132/0x180 mm/kasan/common.c:236 __kasan_slab_free+0x11/0x20 mm/kasan/common.c:244 kasan_slab_free include/linux/kasan.h:177 [inline] slab_free_hook mm/slub.c:1724 [inline] slab_free_freelist_hook+0xc2/0x190 mm/slub.c:1750 slab_free mm/slub.c:3661 [inline] kmem_cache_free+0x12d/0x2a0 mm/slub.c:3683 f2fs_free_inode+0x24/0x30 fs/f2fs/super.c:1562 i_callback+0x4c/0x70 fs/inode.c:250 rcu_do_batch+0x503/0xb80 kernel/rcu/tree.c:2297 rcu_core+0x5a2/0xe70 kernel/rcu/tree.c:2557 rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2574 handle_softirqs+0x178/0x500 kernel/softirq.c:578 run_ksoftirqd+0x28/0x30 kernel/softirq.c:945 smpboot_thread_fn+0x45a/0x8c0 kernel/smpboot.c:164 kthread+0x270/0x310 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 Last potentially related work creation: kasan_save_stack+0x3a/0x60 mm/kasan/common.c:45 __kasan_record_aux_stack+0xb6/0xc0 mm/kasan/generic.c:486 kasan_record_aux_stack_noalloc+0xb/0x10 mm/kasan/generic.c:496 call_rcu+0xd4/0xf70 kernel/rcu/tree.c:2845 destroy_inode fs/inode.c:316 [inline] evict+0x7da/0x870 fs/inode.c:720 iput_final fs/inode.c:1834 [inline] iput+0x62b/0x830 fs/inode.c:1860 do_unlinkat+0x356/0x540 fs/namei.c:4397 __do_sys_unlink fs/namei.c:4438 [inline] __se_sys_unlink fs/namei.c:4436 [inline] __x64_sys_unlink+0x49/0x50 fs/namei.c:4436 x64_sys_call+0x958/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:88 do_syscall_x64 
arch/x86/entry/common.c:51 [inline] do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x68/0xd2 The buggy address belongs to the object at ffff88812d961f20 which belongs to the cache f2fs_inode_cache of size 1200 The buggy address is located 856 bytes inside of 1200-byte region [ffff88812d961f20, ffff88812d9623d0) The buggy address belongs to the physical page: page:ffffea0004b65800 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12d960 head:ffffea0004b65800 order:2 compound_mapcount:0 compound_pincount:0 flags: 0x4000000000010200(slab|head|zone=1) raw: 4000000000010200 0000000000000000 dead000000000122 ffff88810a94c500 raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 2, migratetype Reclaimable, gfp_mask 0x1d2050(__GFP_IO|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_RECLAIMABLE), pid 569, tgid 568 (syz.2.16), ts 55943246141, free_ts 0 set_page_owner include/linux/page_owner.h:31 [inline] post_alloc_hook+0x1d0/0x1f0 mm/page_alloc.c:2532 prep_new_page mm/page_alloc.c:2539 [inline] get_page_from_freelist+0x2e63/0x2ef0 mm/page_alloc.c:4328 __alloc_pages+0x235/0x4b0 mm/page_alloc.c:5605 alloc_slab_page include/linux/gfp.h:-1 [inline] allocate_slab mm/slub.c:1939 [inline] new_slab+0xec/0x4b0 mm/slub.c:1992 ___slab_alloc+0x6f6/0xb50 mm/slub.c:3180 __slab_alloc+0x5e/0xa0 mm/slub.c:3279 slab_alloc_node mm/slub.c:3364 [inline] slab_alloc mm/slub.c:3406 [inline] __kmem_cache_alloc_lru mm/slub.c:3413 [inline] kmem_cache_alloc_lru+0x13f/0x220 mm/slub.c:3429 alloc_inode_sb include/linux/fs.h:3245 [inline] f2fs_alloc_inode+0x2d/0x340 fs/f2fs/super.c:1419 alloc_inode fs/inode.c:261 [inline] iget_locked+0x186/0x880 fs/inode.c:1373 f2fs_iget+0x55/0x4c60 fs/f2fs/inode.c:483 f2fs_fill_super+0x3ad7/0x6bb0 fs/f2fs/super.c:4293 mount_bdev+0x2ae/0x3e0 fs/super.c:1443 
 f2fs_mount+0x34/0x40 fs/f2fs/super.c:4642
 legacy_get_tree+0xea/0x190 fs/fs_context.c:632
 vfs_get_tree+0x89/0x260 fs/super.c:1573
 do_new_mount+0x25a/0xa20 fs/namespace.c:3056
page_owner free stack trace missing

Memory state around the buggy address:
 ffff88812d962100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88812d962180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88812d962200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff88812d962280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88812d962300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

[1] https://syzkaller.appspot.com/x/report.txt?x=13448368580000

This bug can be reproduced with the reproducer [2]. Once CONFIG_F2FS_CHECK_FS is enabled, the reproducer triggers the panic below, so the direct cause of this bug is the same as the one fixed by patch [3].

kernel BUG at fs/f2fs/inode.c:857!
RIP: 0010:f2fs_evict_inode+0x1204/0x1a20
Call Trace:
 <TASK>
 evict+0x32a/0x7a0
 do_unlinkat+0x37b/0x5b0
 __x64_sys_unlink+0xad/0x100
 do_syscall_64+0x5a/0xb0
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
RIP: 0010:f2fs_evict_inode+0x1204/0x1a20

[2] https://syzkaller.appspot.com/x/repro.c?x=17495ccc580000
[3] https://lore.kernel.org/linux-f2fs-devel/20250702120321.1080759-1-chao@kernel.org

Tracepoints before panic:

f2fs_unlink_enter: dev = (7,0), dir ino = 3, i_size = 4096, i_blocks = 8, name = file1
f2fs_unlink_exit: dev = (7,0), ino = 7, ret = 0
f2fs_evict_inode: dev = (7,0), ino = 7, pino = 3, i_mode = 0x81ed, i_size = 10, i_nlink = 0, i_blocks = 0, i_advise = 0x0
f2fs_truncate_node: dev = (7,0), ino = 7, nid = 8, block_address = 0x3c05

f2fs_unlink_enter: dev = (7,0), dir ino = 3, i_size = 4096, i_blocks = 8, name = file3
f2fs_unlink_exit: dev = (7,0), ino = 8, ret = 0
f2fs_evict_inode: dev = (7,0), ino = 8, pino = 3, i_mode = 0x81ed, i_size = 9000, i_nlink = 0, i_blocks = 24, i_advise = 0x4
f2fs_truncate: dev = (7,0), ino = 8, pino = 3, i_mode = 0x81ed, i_size = 0, i_nlink = 0, i_blocks = 24, i_advise = 0x4
f2fs_truncate_blocks_enter: dev = (7,0), ino = 8, i_size = 0, i_blocks = 24, start file offset = 0
f2fs_truncate_blocks_exit: dev = (7,0), ino = 8, ret = -2

The root cause is: in the fuzzed image, dnode #8 belongs to inode #7, so after inode #7 was evicted, dnode #8 was dropped. However, there is a dirent that has ino #8. Once file3 is unlinked, both f2fs_truncate() and f2fs_update_inode_page() fail in f2fs_evict_inode() because node #8 cannot be loaded, and as a result f2fs_inode_synced() is never called to clear the inode's dirty status.

Let's fix this by calling f2fs_inode_synced() in the error path of f2fs_evict_inode().

PS: I verified that the reproducer [2] triggers this bug in v6.1.129 but fails to trigger it in v6.16-rc4, because the test case stops once f2fs detects other corruption:

F2FS-fs (loop0): inconsistent node block, node_type:2, nid:8, node_footer[nid:8,ino:8,ofs:0,cpver:5013063228981249506,blkaddr:15366]
F2FS-fs (loop0): f2fs_lookup: inode (ino=9) has zero i_nlink

Fixes: 0f18b46 ("f2fs: flush inode metadata when checkpoint is doing")
Closes: https://syzkaller.appspot.com/x/report.txt?x=13448368580000
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
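
The root cause described above can be illustrated with a small user-space model. This is only a sketch, not f2fs code: `model_inode`, `model_sb`, `model_mark_dirty()`, `model_inode_synced()` and `model_evict()` are hypothetical names, and the model assumes that "dirty status" amounts to membership in a list that is walked later, so an inode released while still on that list would leave a dangling entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an in-memory inode. */
struct model_inode {
	bool on_dirty_list;
	struct model_inode *next;
};

/* Hypothetical stand-in for the per-filesystem dirty inode list. */
struct model_sb {
	struct model_inode *dirty_head;
};

void model_mark_dirty(struct model_sb *sb, struct model_inode *inode)
{
	assert(sb != NULL && inode != NULL);
	if (inode->on_dirty_list)
		return;
	inode->next = sb->dirty_head;
	sb->dirty_head = inode;
	inode->on_dirty_list = true;
}

/* Plays the role of f2fs_inode_synced(): drop the inode from the dirty list. */
void model_inode_synced(struct model_sb *sb, struct model_inode *inode)
{
	struct model_inode **pp;

	assert(sb != NULL && inode != NULL);
	if (!inode->on_dirty_list)
		return;
	for (pp = &sb->dirty_head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == inode) {
			*pp = inode->next;
			break;
		}
	}
	inode->next = NULL;
	inode->on_dirty_list = false;
}

/*
 * Eviction in the model. When node_load_fails is true, neither the truncate
 * nor the inode page update can run, so the error path has to clear the
 * dirty state itself before the caller releases the inode.
 */
void model_evict(struct model_sb *sb, struct model_inode *inode,
		 bool node_load_fails)
{
	if (node_load_fails) {
		model_inode_synced(sb, inode);	/* the fix: sync on error */
		return;
	}
	/* Normal path: metadata is written back, then dirty state is cleared. */
	model_inode_synced(sb, inode);
}
```

In this model, removing the `model_inode_synced()` call from the error branch would leave the evicted inode reachable from `dirty_head`, which is the situation the fix is meant to rule out.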
On some external USB hard drives, mounting can fail if "lshw" is executed during the process.
This occurs because data sent to the device's output endpoint in certain abnormal scenarios does not receive a response, leading to a mount timeout.
[ Description of "use_192_bytes_for_3f" in the kernel code: ]
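/*
 * Many disks only accept MODE SENSE transfer lengths of
 * 192 bytes (that's what Windows uses).
 */
sdev->use_192_bytes_for_3f = 1;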
The kernel's SCSI driver, when handling devices with this attribute, sends commands with a length of 192 bytes like this:
	if (sdp->use_192_bytes_for_3f)
		res = sd_do_mode_sense(sdp, 0, 0x3F, buffer, 192, &data, NULL);
However, "lshw" disregards the "use_192_bytes_for_3f" attribute and issues a command with a transfer length of 0xff bytes via ioctl, which can cause some hard drives to hang and become unusable.
To resolve this issue, prevent commands with a length of 0xff bytes from being queued via ioctl when the device has the "use_192_bytes_for_3f" attribute set.
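
As a rough illustration of the check described above, here is a user-space sketch rather than the actual patch. `struct model_sdev` and `model_should_bypass()` are hypothetical names. The sketch also assumes that the offending request is a MODE SENSE(6) CDB (opcode 0x1a) asking for page 0x3f with an allocation length of 0xff in byte 4; the description only states the 0xff length and the attribute.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MODE_SENSE_6	0x1a	/* MODE SENSE(6) opcode */
#define MODEL_PAGE_ALL		0x3f	/* "all pages" page code */

/* Hypothetical stand-in for the one scsi_device flag that matters here. */
struct model_sdev {
	bool use_192_bytes_for_3f;
};

/*
 * Return true if a user-supplied CDB should be rejected instead of queued.
 * Assumption: only MODE SENSE(6) for page 0x3f with length 0xff is filtered.
 */
bool model_should_bypass(const struct model_sdev *sdev,
			 const uint8_t *cdb, size_t cdb_len)
{
	assert(sdev != NULL && cdb != NULL);

	if (!sdev->use_192_bytes_for_3f)
		return false;		/* device has no such quirk */
	if (cdb_len < 6 || cdb[0] != MODEL_MODE_SENSE_6)
		return false;		/* not MODE SENSE(6) */
	if ((cdb[2] & 0x3f) != MODEL_PAGE_ALL)
		return false;		/* some other mode page */
	return cdb[4] == 0xff;		/* allocation length byte */
}
```

With this shape, a 192-byte MODE SENSE like the one the kernel issues itself still passes, and devices without the attribute are unaffected.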
The hard drive device identified with the issue is Lenovo USB 17ef:4531. Tested on HONOR NBLK-WAX9X (C234) Notebook with AMD Ryzen 7 3700U.
[ Kernel logs: ]
2024-10-31 13:36:11 localhost kernel: [ 25.770091] usb 2-2: new SuperSpeed Gen 1 USB device number 2 using xhci_hcd
2024-10-31 13:36:11 localhost kernel: [ 25.798558] usb 2-2: New USB device found, idVendor=17ef, idProduct=4531, bcdDevice= 5.12
2024-10-31 13:36:11 localhost kernel: [ 25.798562] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
2024-10-31 13:36:11 localhost kernel: [ 25.798564] usb 2-2: Product: Lenovo Portable HDD
2024-10-31 13:36:11 localhost kernel: [ 25.798566] usb 2-2: Manufacturer: Lenovo
2024-10-31 13:36:11 localhost kernel: [ 25.798567] usb 2-2: SerialNumber: 000000001E4C
2024-10-31 13:36:11 localhost kernel: [ 25.820244] usb-storage 2-2:1.0: USB Mass Storage device detected
2024-10-31 13:36:11 localhost kernel: [ 25.820457] scsi host0: usb-storage 2-2:1.0
2024-10-31 13:36:11 localhost kernel: [ 25.820633] usbcore: registered new interface driver usb-storage
2024-10-31 13:36:11 localhost kernel: [ 25.825598] usbcore: registered new interface driver uas
2024-10-31 13:36:14 localhost kernel: [ 28.852179] scsi 0:0:0:0: Direct-Access Lenovo USB Hard Drive 0006 PQ: 0 ANSI: 2
2024-10-31 13:36:14 localhost kernel: [ 28.852961] sd 0:0:0:0: Attached scsi generic sg0 type 0
2024-10-31 13:36:14 localhost kernel: [ 28.891218] sd 0:0:0:0: [sda] 976773164 512-byte logical blocks: (500 GB/466 GiB)
2024-10-31 13:36:14 localhost kernel: [ 28.906892] sd 0:0:0:0: [sda] Write Protect is off
2024-10-31 13:36:14 localhost kernel: [ 28.906896] sd 0:0:0:0: [sda] Mode Sense: 03 00 00 00
2024-10-31 13:36:14 localhost kernel: [ 28.922606] sd 0:0:0:0: [sda] No Caching mode page found
2024-10-31 13:36:14 localhost kernel: [ 28.922612] sd 0:0:0:0: [sda] Assuming drive cache: write through
2024-10-31 13:36:14 localhost kernel: [ 29.007816] sda: sda1
2024-10-31 13:36:15 localhost kernel: [ 30.180380] sd 0:0:0:0: [sda] Attached SCSI disk
2024-10-31 13:36:16 localhost kernel: [ 30.722863] snd_hda_codec_realtek hdaudioC1D0: hda_codec_setup_stream: NID=0x3, stream=0x5, channel=0, format=0x4011
2024-10-31 13:36:16 localhost kernel: [ 30.734139] snd_hda_codec_realtek hdaudioC1D0: hda_codec_setup_stream: NID=0x2, stream=0x5, channel=0, format=0x4011
2024-10-31 13:36:17 localhost kernel: [ 31.396011] start_addr=(0x20000), end_addr=(0x40000), buffer_size=(0x20000), smp_number_max=(16384)
2024-10-31 13:36:18 localhost kernel: [ 32.933537] snd_hda_codec_realtek hdaudioC1D0: hda_codec_cleanup_stream: NID=0x3
2024-10-31 13:36:18 localhost kernel: [ 32.933541] snd_hda_codec_realtek hdaudioC1D0: hda_codec_cleanup_stream: NID=0x2
2024-10-31 13:36:39 localhost kernel: [ 54.242220] usb 2-2: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
2024-10-31 13:36:50 localhost kernel: [ 64.408879] start_addr=(0x20000), end_addr=(0x40000), buffer_size=(0x20000), smp_number_max=(16384)
2024-10-31 13:37:11 localhost kernel: [ 85.466479] usb 2-2: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
2024-10-31 13:37:11 localhost kernel: [ 85.490248] sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
2024-10-31 13:37:11 localhost kernel: [ 85.490255] sd 0:0:0:0: [sda] tag#0 CDB: Read(10) 28 00 00 00 00 20 00 00 08 00
2024-10-31 13:37:11 localhost kernel: [ 85.490258] print_req_error: I/O error, dev sda, sector 32
2024-10-31 13:37:33 localhost kernel: [ 107.432186] start_addr=(0x20000), end_addr=(0x40000), buffer_size=(0x20000), smp_number_max=(16384)
2024-10-31 13:37:41 localhost kernel: [ 116.194201] usb 2-2: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
2024-10-31 13:37:49 localhost kernel: [ 123.555484] dolphin[7271]: segfault at 10 ip 00007fcccc0d7f76 sp 00007ffe8004b860 error 4 in libKF5CoreAddons.so.5.102.0[7fcccc0a5000+83000]
2024-10-31 13:37:49 localhost kernel: [ 123.555502] Code: d6 90 66 90 41 54 41 89 d4 55 48 89 fd 53 48 89 f3 e8 8e 94 01 00 ba 04 00 00 00 48 89 de 48 89 c7 e8 4e 8f 01 00 84 c0 75 2a <48> 8b 7d 10 48 85 ff 74 21 45 89 e1 48 89 da 48 89 ee 5b 41 b8 01
2024-10-31 13:38:11 localhost kernel: [ 146.229510] usb 2-2: USB disconnect, device number 2
2024-10-31 13:38:11 localhost kernel: [ 146.237993] scsi 0:0:0:0: rejecting I/O to dead device
2024-10-31 13:38:11 localhost kernel: [ 146.238003] print_req_error: I/O error, dev sda, sector 32
2024-10-31 13:38:11 localhost kernel: [ 146.238009] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238029] scsi 0:0:0:0: rejecting I/O to dead device
2024-10-31 13:38:11 localhost kernel: [ 146.238030] print_req_error: I/O error, dev sda, sector 36
2024-10-31 13:38:11 localhost kernel: [ 146.238032] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238045] scsi 0:0:0:0: rejecting I/O to dead device
2024-10-31 13:38:11 localhost kernel: [ 146.238047] print_req_error: I/O error, dev sda, sector 6291480
2024-10-31 13:38:11 localhost kernel: [ 146.238062] Buffer I/O error on dev sda1, logical block 786431, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238168] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238170] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238175] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238176] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238184] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238185] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238199] Buffer I/O error on dev sda, logical block 40, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238201] Buffer I/O error on dev sda, logical block 41, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238205] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238206] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238210] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238211] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238215] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238217] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238220] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238221] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238224] Buffer I/O error on dev sda, logical block 8, async page read
2024-10-31 13:38:11 localhost kernel: [ 146.238226] Buffer I/O error on dev sda, logical block 9, async page read
2024-10-31 13:38:12 localhost kernel: [ 146.482007] snd_hda_codec_realtek hdaudioC1D0: hda_codec_setup_stream: NID=0x3, stream=0x5, channel=0, format=0x4011
2024-10-31 13:38:12 localhost kernel: [ 146.494064] snd_hda_codec_realtek hdaudioC1D0: hda_codec_setup_stream: NID=0x2, stream=0x5, channel=0, format=0x4011
2024-10-31 13:38:15 localhost kernel: [ 150.065848] snd_hda_codec_realtek hdaudioC1D0: hda_codec_cleanup_stream: NID=0x3
2024-10-31 13:38:15 localhost kernel: [ 150.065852] snd_hda_codec_realtek hdaudioC1D0: hda_codec_cleanup_stream: NID=0x2
2024-10-31 13:38:26 localhost kernel: [ 160.433037] start_addr=(0x20000), end_addr=(0x40000), buffer_size=(0x20000), smp_number_max=(16384)
2024-10-31 13:39:29 localhost kernel: [ 223.444589] start_addr=(0x20000), end_addr=(0x40000), buffer_size=(0x20000), smp_number_max=(16384)
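
For readers of the log above, the timed-out command "Read(10) 28 00 00 00 00 20 00 00 08 00" can be decoded by hand. The sketch below assumes the standard READ(10) layout (big-endian LBA in bytes 2 to 5, transfer length in bytes 7 to 8); `struct read10` and `decode_read10()` are illustrative names, not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct read10 {
	uint32_t lba;		/* bytes 2..5, big-endian */
	uint16_t nblocks;	/* bytes 7..8, big-endian */
};

/* Returns 0 on success, -1 if the buffer is not a READ(10) CDB. */
int decode_read10(const uint8_t *cdb, size_t len, struct read10 *out)
{
	assert(cdb != NULL && out != NULL);

	if (len < 10 || cdb[0] != 0x28)
		return -1;
	out->lba = ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
		   ((uint32_t)cdb[4] << 8) | (uint32_t)cdb[5];
	out->nblocks = (uint16_t)(((unsigned int)cdb[7] << 8) | cdb[8]);
	return 0;
}
```

Applied to the bytes in the log, this gives LBA 32 and a length of 8 blocks, which matches the "I/O error, dev sda, sector 32" line that follows it.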
Link: https://linux-hardware.org/?id=usb:17ef-4531
Link: https://lore.kernel.org/all/80ef917b-3680-4f85-93ba-c92d2b69ebaa@rowland.harvard.edu/
Link: https://lore.kernel.org/all/ad4bd008-8d0d-439b-879c-e9cf4c89ec56@acm.org/
Link: https://lore.kernel.org/all/4EB8ECD64F601331+e2f01a1f-8da5-4e7b-b909-d920a792756a@uniontech.com/
Reported-by: Xinwei Zhou <zhouxinwei@uniontech.com>
Co-developed-by: Xu Rao <raoxu@uniontech.com>
Signed-off-by: Xu Rao <raoxu@uniontech.com>
Tested-by: Yujing Ming <mingyujing@uniontech.com>
Signed-off-by: WangYuli <wangyuli@uniontech.com>